

Search for: All records

Creators/Authors contains: "Smaranda"


  1. Polysynthetic languages present a challenge for morphological analysis due to the complexity of their words and the lack of high-quality annotated datasets needed to build and/or evaluate computational models. The contribution of this work is twofold. First, using linguists’ help, we generate and contribute high-quality annotated data for two low-resource polysynthetic languages for two tasks: morphological segmentation and part-of-speech (POS) tagging. Second, we present the results of state-of-the-art unsupervised approaches for these two tasks on Adyghe and Inuktitut. Our findings show that for these polysynthetic languages, using linguistic priors helps the task of morphological segmentation and that using stems rather than words as the core unit of abstraction leads to superior performance on POS tagging. 
  2. Polysynthetic languages present a challenge for morphological analysis due to the complexity of their words and the lack of high-quality annotated datasets needed to build and/or evaluate computational models. The contribution of this work is twofold. First, using linguists’ help, we generate and contribute high-quality annotated data for two low-resource polysynthetic languages for two tasks: morphological segmentation and part-of-speech (POS) tagging. Second, we present the results of state-of-the-art unsupervised approaches for these two tasks on Adyghe and Inuktitut. Our findings show that for these polysynthetic languages, using linguistic priors helps the task of morphological segmentation and that using stems rather than words as the core unit of abstraction leads to superior performance on POS tagging. 
  3. Synthesis and isolation of molecular building blocks of metal–organic frameworks (MOFs) can provide unique opportunities for characterization that would otherwise be inaccessible due to the heterogeneous nature of MOFs. Herein, we report a series of trinuclear cobalt complexes incorporating dithiolene ligands, triphenylene-2,3,6,7,10,11-hexathiolate (THT) (1³⁺) and benzene hexathiolate (BHT) (2³⁺), with 1,1,1-tris(diphenylphosphinomethyl)ethane (triphos) employed as the capping ligand. Single crystal X-ray analyses of 1³⁺ and 2³⁺ display three five-coordinate cobalt centers bound to the triphos and dithiolene ligands in a distorted square pyramidal geometry. Cyclic voltammetry studies of 1³⁺ and 2³⁺ reveal three redox features associated with the formation of mixed valence states due to the sequential reduction of the redox-active metal centers (Co(III)/Co(II)). Using this electrochemical data, the comproportionality values were determined for 1 and 2 (log Kc = 1.4 and 1.5 for 1, and 4.7 and 5.8 for 2), suggesting strong resonance-stabilized coupling of the metal centers, with stronger electronic coupling observed for complex 2 compared to that for complex 1. Cyclic voltammetry studies were also performed in solvents of varying polarity, whereupon the difference in the standard potentials (ΔE₁/₂) for 1 and 2 was found to shift as a function of the polarity of the solvent, indicating a negative correlation between the dielectric constant of the electrochemical medium and the stability of the mixed valence species. Spectroelectrochemical studies of in situ generated multi-valent (MV) states of complexes 1 and 2 display characteristic NIR intervalence charge transfer (IVCT) bands, and analysis of the IVCT transitions for complex 2 suggests a weakly coupled class II multi-valent species and relatively large electronic coupling factors (1700 cm⁻¹ for the first multi-valent state of 2²⁺, and 1400 and 4000 cm⁻¹ for the second multi-valent state of 2⁺). Density functional theory (DFT) calculations indicate a significant deviation in relative energies of the frontier orbitals of complexes 1³⁺, 2³⁺, and 3⁺ that contrasts with those calculated for the analogous trinuclear cobalt dithiolene complexes employing pentamethylcyclopentadienyl (Cp*) as the capping ligand (Co₃Cp*₃THT and Co₃Cp*₃BHT, respectively), and may be a result of the cationic nature of complexes 1³⁺, 2³⁺, and 3⁺. 
  4. Unsupervised cross-lingual projection for part-of-speech (POS) tagging relies on the use of parallel data to project POS tags from a source language for which a POS tagger is available onto a target language across word-level alignments. The projected tags then form the basis for learning a POS model for the target language. However, languages with rich morphology often yield sparse word alignments because words corresponding to the same citation form do not align well. We hypothesize that for morphologically complex languages, it is more efficient to use the stem rather than the word as the core unit of abstraction. Our contributions are: 1) we propose an unsupervised stem-based cross-lingual approach for POS tagging for low-resource languages with rich morphology; 2) we further investigate morpheme-level alignment and projection; and 3) we examine whether the use of linguistic priors for morphological segmentation improves POS tagging. We conduct experiments using six source languages and eight morphologically complex target languages of diverse typologies. Our results show that the stem-based approach improves the POS models for all the target languages, with an average relative error reduction of 10.3% in accuracy per target language, and outperforms the word-based approach that operates on three times more data for about two-thirds of the language pairs we consider. Moreover, we show that morpheme-level alignment and projection and the use of linguistic priors for morphological segmentation further improve POS tagging. 
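The projection step and the error-reduction figure described in the last record can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `project_tags` and `relative_error_reduction`, the majority-vote rule for resolving multiple alignments, and all example data are assumptions made for illustration. The aligned units may be words or stems; the sketch only assumes that alignments arrive as index pairs.

```python
from collections import Counter


def project_tags(source_tags, alignments, target_len):
    """Project POS tags from source units onto target units.

    source_tags: one tag per source unit (word or stem).
    alignments:  (source_index, target_index) pairs from an aligner.
    Returns a list of length target_len; unaligned target units get None.
    Assumption: conflicts are resolved by majority vote over aligned tags.
    """
    votes = [Counter() for _ in range(target_len)]
    for src_i, tgt_i in alignments:
        votes[tgt_i][source_tags[src_i]] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def relative_error_reduction(baseline_acc, new_acc):
    """Fraction of the baseline error (1 - accuracy) removed by the new model."""
    return (new_acc - baseline_acc) / (1.0 - baseline_acc)


# Hypothetical example: three source units, three target units,
# and the third target unit has no alignment.
source_tags = ["DET", "NOUN", "VERB"]
alignments = [(1, 0), (2, 1)]
projected = project_tags(source_tags, alignments, target_len=3)
print(projected)

# Hypothetical accuracies, not taken from the paper.
print(relative_error_reduction(0.80, 0.82))
```

Under this definition, moving from 80% to 82% accuracy removes about a tenth of the baseline error, which is the sense in which a relative error reduction is usually reported. Sparse alignments show up here as `None` entries in the projected tags.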